Structural analysis of a chromatin model

Structural models generated in previous tutorials can also be analyzed in more depth. This tutorial describes some classical methods used by structural biologists that are included in TADBit.

These methods are mainly designed to be used on single models, so we start this tutorial by loading a single structural model:

from pytadbit.imp.impmodel import load_impmodel_from_cmm, load_impmodel_from_xyz

model = load_impmodel_from_cmm('./model.3261.cmm')

General shape of the model

This section describes some methods that give a rough description of the three-dimensional occupancy of a model.

Finding the center of mass and radius of gyration

A quick way to assess how dense or compact a model is consists in calculating its radius of gyration (see pytadbit.imp.impmodel.IMPmodel.radius_of_gyration()) and its center of mass:

print model.center_of_mass()
print model.radius_of_gyration()
{'y': -11976.696977752474, 'x': 1459.4080388960351, 'z': -5505.949389580002}
2107.38031986
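To make the two quantities concrete, here is a minimal sketch of how a center of mass and a radius of gyration can be computed from a list of particle coordinates. It assumes equally weighted particles and uses hypothetical toy coordinates; the helper names are illustrative and this is not necessarily how TADBit implements these methods.

```python
from math import sqrt

def center_of_mass(coords):
    """Mean position of equally weighted particles given as (x, y, z) tuples."""
    n = float(len(coords))
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def radius_of_gyration(coords):
    """Root-mean-square distance of the particles from their center of mass."""
    cx, cy, cz = center_of_mass(coords)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
                  for x, y, z in coords)
    return sqrt(squared / len(coords))

# Hypothetical toy model (nm): the four corners of a 200 nm square
toy = [(0.0, 0.0, 0.0), (200.0, 0.0, 0.0),
       (200.0, 200.0, 0.0), (0.0, 200.0, 0.0)]

print(center_of_mass(toy))
print(radius_of_gyration(toy))
```

The smaller the radius of gyration relative to the length of the strand, the more compact the model.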

As an extra feature, the radius of gyration (or gyradius) can also be displayed within Chimera:

model.view_model(tool='chimera_nogui', savefig='/tmp/image_model_2.png', centroid=True, gyradius=True)
model.view_model(tool='chimera_nogui', centroid=True, gyradius=True, savefig='/tmp/image_model_2.webm')
[Figure: tutorial_7_single_model_analysis_11_0.png, the model displayed in Chimera with its centroid and gyradius]

Model length

print model.contour()
100002.254636

The modeled chromatin strand is thus about 100002 nm long.
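As an illustration of what such a length measures, the sketch below sums the distances between consecutive particles of a strand. It assumes that the contour is the total length of the segments linking neighboring particles; the function name and the toy coordinates are hypothetical, and the exact definition used by model.contour() should be checked in the function documentation.

```python
from math import sqrt

def contour_length(coords):
    """Sum of the Euclidean distances between consecutive particles."""
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords[:-1], coords[1:]):
        total += sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)
    return total

# Hypothetical toy strand (nm) made of three straight segments
toy = [(0.0, 0.0, 0.0), (300.0, 0.0, 0.0),
       (300.0, 400.0, 0.0), (300.0, 400.0, 1200.0)]

print(contour_length(toy))
```

Under the same assumption, a model of 100 particles with a contour of about 100002 nm would have an average distance of roughly 1010 nm between consecutive particles (100002 / 99).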

Fitting model into a cube

Find the longest and the shortest distances between two particles:

print model.longest_axe()
print model.shortest_axe()
7033.67649175
374.538753758

Characterize a cube that includes the model:

print model.cube_side()
print model.cube_volume()
8861.41178682
6.95838983082e+11
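As a rough illustration of what bounding a model involves, the sketch below measures the span of a set of particles along each axis and the diagonal of the corresponding bounding box. These are generic measures on hypothetical coordinates, not necessarily the quantities that longest_axe(), shortest_axe() or cube_side() compute. The last lines only cross-check the two values printed above, which are consistent with the volume being the cube side raised to the third power.

```python
from math import sqrt

def axis_extents(coords):
    """Span (max - min) of the particle coordinates along x, y and z."""
    return [max(c[i] for c in coords) - min(c[i] for c in coords)
            for i in range(3)]

# Hypothetical toy coordinates (nm)
toy = [(0.0, 0.0, 0.0), (300.0, 50.0, 0.0), (100.0, 400.0, 120.0)]

extents = axis_extents(toy)                      # spans along x, y and z
diagonal = sqrt(sum(e ** 2 for e in extents))    # diagonal of the bounding box

print(extents)
print(diagonal)

# Cross-check of the printed values: side ** 3 against the reported volume
side, volume = 8861.41178682, 6.95838983082e+11
relative_gap = abs(side ** 3 - volume) / volume
print(relative_gap < 1e-6)
```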

Chromatin accessibility: fitting objects inside the model

In order to infer which part of the modeled chromatin can be accessed by an object, like the transcription machinery, TADBit calculates a mesh around the model, and checks for each point of this mesh if an object of a given size can fit.

Here is an example revealing the surface of a chromatin strand accessible to a hypothetical protein 400 nanometers in diameter (radius of 200 nanometers):

acc_dots, tot_dots, acc_area, tot_area, acc_vs_inacc = model.accessible_surface(
                200, nump=100, verbose=True, write_cmm_file='./model_mesh.cmm',
                savefig='/tmp/model_mesh.webm', chimera_bin='chimera_nogui')
Accessible surface: 90.34 micrometers^2(17972 accessible times 0.00502654824574 micrometers)
   (17972 accessible dots of 22080 total times 0.00503 micrometers)
 - 81.39% of the contour mesh
 - 71.6% of a virtual straight chromatin (126.17 microm^2)

The function above provides a large amount of information.

  • The text printed (when verbose=True) corresponds to some general statistics about the accessibility of the chromatin.
    • In this example, 81% of the chromatin is accessible to the hypothetical protein. This number includes not only the particles but also the edges linking them (remember that a particle represents a given locus of DNA). The second percentage printed corresponds to the percentage of accessible chromatin without taking its folding into consideration (that is, considering a straight strand of chromatin).
    • As stated above, in order to infer the proportion of accessible chromatin, a mesh is drawn around the chromatin strand. This mesh represents all possible positions of the hypothetical protein. Information about surfaces is relative to this mesh, not to the real accessible surface of the chromatin. However, the numbers are proportional, and the percentages are conserved.
    • The dots also mentioned in the output represent the mesh; their number is proportional to the nump parameter. Accessibility is measured using these dots: if a dot is distant enough from every point of the chromatin strand, then it is considered accessible, while if some part of the chromatin lies closer to the dot than the radius of the hypothetical protein, the dot is considered inaccessible because the protein could not fit in its place. See the movie below (generated using the savefig parameter) for a better understanding: dots are displayed in green when they represent a possible placement of the hypothetical protein, and in red when the protein would not fit.
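The accessibility criterion described in the last point can be sketched in a few lines. This is a simplified illustration, not TADBit's implementation: the strand is reduced to a handful of sample points, the mesh dots are given by hand, and all names and coordinates are hypothetical.

```python
from math import sqrt

def distance(a, b):
    """Euclidean distance between two (x, y, z) points."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def classify_dots(dots, chromatin_points, radius):
    """Split mesh dots into accessible ("green") and inaccessible ("red").

    A dot is accessible when no chromatin point lies closer than `radius`,
    meaning that a sphere of that radius centered on the dot would fit.
    """
    accessible, inaccessible = [], []
    for dot in dots:
        if all(distance(dot, point) >= radius for point in chromatin_points):
            accessible.append(dot)
        else:
            inaccessible.append(dot)
    return accessible, inaccessible

# Hypothetical data (nm): two chromatin points and three candidate positions
chromatin = [(0.0, 0.0, 0.0), (500.0, 0.0, 0.0)]
dots = [(250.0, 0.0, 0.0),    # 250 nm away from both points
        (100.0, 100.0, 0.0),  # about 141 nm away from the first point
        (0.0, 300.0, 0.0)]    # 300 nm away from the first point

green, red = classify_dots(dots, chromatin, radius=200)
print(len(green))
print(len(red))
```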

In order to measure how “buried” each particle is, the function returns a list of values (which, in the example above, we store in the acc_vs_inacc variable). This list contains, for each particle, the number of “green dots” and the number of “red dots”. A useful value derived from them is the buried percentage of each particle (100% means that the particle is completely inaccessible to the given protein).

Continuing with the example, these numbers can be obtained from the acc_vs_inacc list:

for i, (acc, ina) in enumerate(acc_vs_inacc):
    print 'particle %3s is %6.2f%% buried'%(i+1, float(ina)/(acc+ina)*100)
particle   1 is   0.00% buried
particle   2 is   0.00% buried
particle   3 is   0.00% buried
particle   4 is   0.00% buried
particle   5 is   0.00% buried
particle   6 is   0.00% buried
particle   7 is   2.27% buried
particle   8 is  39.29% buried
particle   9 is   0.00% buried
particle  10 is   3.45% buried
particle  11 is   0.00% buried
particle  12 is   0.00% buried
particle  13 is   0.00% buried
particle  14 is   0.00% buried
particle  15 is   0.00% buried
particle  16 is   0.00% buried
particle  17 is  19.05% buried
particle  18 is   0.00% buried
particle  19 is   0.00% buried
particle  20 is   0.00% buried
particle  21 is   0.00% buried
particle  22 is   0.00% buried
particle  23 is   0.00% buried
particle  24 is   0.00% buried
particle  25 is   0.00% buried
particle  26 is   6.67% buried
particle  27 is  41.67% buried
particle  28 is  24.49% buried
particle  29 is   0.00% buried
particle  30 is   0.00% buried
particle  31 is   0.00% buried
particle  32 is  18.75% buried
particle  33 is   0.00% buried
particle  34 is  69.57% buried
particle  35 is   2.13% buried
particle  36 is   0.00% buried
particle  37 is   2.00% buried
particle  38 is   0.00% buried
particle  39 is   0.00% buried
particle  40 is   0.00% buried
particle  41 is   0.00% buried
particle  42 is   5.66% buried
particle  43 is   0.00% buried
particle  44 is  66.67% buried
particle  45 is   0.00% buried
particle  46 is   0.00% buried
particle  47 is   4.17% buried
particle  48 is  57.14% buried
particle  49 is  47.27% buried
particle  50 is   3.70% buried
particle  51 is  44.83% buried
particle  52 is   5.66% buried
particle  53 is   0.00% buried
particle  54 is   5.00% buried
particle  55 is   5.56% buried
particle  56 is  38.10% buried
particle  57 is   0.00% buried
particle  58 is   0.00% buried
particle  59 is  32.56% buried
particle  60 is   0.00% buried
particle  61 is   8.11% buried
particle  62 is   0.00% buried
particle  63 is   0.00% buried
particle  64 is  26.53% buried
particle  65 is   0.00% buried
particle  66 is   0.00% buried
particle  67 is  11.90% buried
particle  68 is   0.00% buried
particle  69 is   1.89% buried
particle  70 is   4.55% buried
particle  71 is   0.00% buried
particle  72 is   2.63% buried
particle  73 is   0.00% buried
particle  74 is  17.14% buried
particle  75 is   0.00% buried
particle  76 is   4.35% buried
particle  77 is   0.00% buried
particle  78 is   7.32% buried
particle  79 is  43.33% buried
particle  80 is   0.00% buried
particle  81 is   0.00% buried
particle  82 is   4.88% buried
particle  83 is   0.00% buried
particle  84 is   0.00% buried
particle  85 is   0.00% buried
particle  86 is   0.00% buried
particle  87 is   0.00% buried
particle  88 is   0.00% buried
particle  89 is   0.00% buried
particle  90 is   5.36% buried
particle  91 is  25.00% buried
particle  92 is   0.00% buried
particle  93 is   0.00% buried
particle  94 is   0.00% buried
particle  95 is   5.56% buried
particle  96 is   0.00% buried
particle  97 is   0.00% buried
particle  98 is   0.00% buried
particle  99 is   2.00% buried
particle 100 is   0.00% buried

Note that in this example no particle is 100% buried.
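When working with such a list, it can be convenient to rank particles by how buried they are. The sketch below does this on hypothetical (accessible, inaccessible) dot counts shaped like the entries of acc_vs_inacc. The helper name and the counts are illustrative assumptions, and the guard against particles with no dots at all is a precaution rather than a documented behavior of the library.

```python
def buried_percentage(accessible, inaccessible):
    """Percentage of inaccessible dots, or None when there are no dots."""
    total = accessible + inaccessible
    if total == 0:
        return None
    return 100.0 * inaccessible / total

# Hypothetical counts of (accessible, inaccessible) dots per particle
toy_counts = [(40, 0), (17, 11), (7, 16), (0, 0)]

percentages = [buried_percentage(acc, ina) for acc, ina in toy_counts]

# Rank particles (numbered from 1) from most to least buried,
# skipping those for which no percentage could be computed
ranked = sorted(((i + 1, pct) for i, pct in enumerate(percentages)
                 if pct is not None),
                key=lambda item: item[1], reverse=True)

for particle, pct in ranked:
    print('particle %3s is %6.2f%% buried' % (particle, pct))
```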

In order to visualize what this result really means, the mesh can be displayed only around particles by setting the option include_edges to False. In this case, the global accessibility value of the chromatin will change, but the individual statistics of the particles will be kept.

In the resulting movie, only the relevant part of the mesh is shown for each particle. Note that only part of the sphere surrounding each particle is displayed, as nearby edges prevent the protein from approaching the particle. For more details on how the mesh is built, refer to the function documentation: pytadbit.imp.impmodel.IMPmodel.accessible_surface()

acc_dots, tot_dots, acc_area, tot_area, acc_vs_inacc = model.accessible_surface(
                200, nump=100, verbose=True, include_edges=False, write_cmm_file='./model_partmesh.cmm',
                savefig='/tmp/model_partmesh.webm', chimera_bin='chimera_nogui')
Accessible surface: 19.46 micrometers^2(3872 accessible times 0.00502654824574 micrometers)
   (3872 accessible dots of 4125 total times 0.00503 micrometers)
 - 93.87% of the contour mesh
 - 15.43% of a virtual straight chromatin (126.17 microm^2)
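The summary figures printed by both calls can be cross-checked by hand. The per-dot value of 0.00503 in the output is consistent with each dot representing 1/nump of the surface of a sphere of radius 200 nm (0.2 micrometers); this relation is inferred from the printed numbers, not taken from the library source. The sketch below recomputes the area and the two percentages from the dot counts, using the 126.17 square micrometers reported for the straight chromatin.

```python
from math import pi

radius_um = 0.2   # radius of the hypothetical protein, in micrometers
nump = 100        # value of the nump parameter used in both calls

# Assumed area represented by one dot: sphere surface divided by nump
dot_area = 4 * pi * radius_um ** 2 / nump

def summarize(accessible_dots, total_dots, straight_area=126.17):
    """Return (area, % of the mesh, % of the straight chromatin), rounded."""
    area = accessible_dots * dot_area
    return (round(area, 2),
            round(100.0 * accessible_dots / total_dots, 2),
            round(100.0 * area / straight_area, 2))

with_edges = summarize(17972, 22080)     # mesh around particles and edges
particles_only = summarize(3872, 4125)   # include_edges=False

print(with_edges)
print(particles_only)
```

The recomputed values agree with the printed ones (90.34 and 19.46 square micrometers, 81.39% and 93.87% of the mesh, 71.6% and 15.43% of the straight chromatin).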